Motion vector field error estimation

ABSTRACT

A technique is disclosed for estimating the measurement error in motion vectors used for example in a motion compensated video signal process. For each motion vector corresponding to a region of an image a plurality of temporal and spatial image gradients are calculated corresponding to that region. From the constraint equations of the image gradients a plurality of error values can be calculated for each motion vector and a parameter generated describing the size of the distribution of motion vector measurement errors. Subsequent processing of the video signals using the motion vectors can then be adapted, for example by graceful fallback in motion compensated interpolation, depending on the accuracy of each motion vector. The ‘confidence’ in the accuracy of each motion vector can be described by a parameter calculated in relation to the size of the error distribution and the motion vector speed.

BACKGROUND OF THE INVENTION

The invention relates to motion estimation in video and film signal processing, in particular, to a technique for assessing the reliability of motion vectors.

DESCRIPTION OF THE RELATED ART

Gradient motion estimation is one of three or four fundamental motion estimation techniques and is well known in the literature (references 1 to 18). More correctly called ‘constraint equation based motion estimation’ it is based on a partial differential equation which relates the spatial and temporal image gradients to motion.

Gradient motion estimation is based on the constraint equation relating the image gradients to motion. The constraint equation is a direct consequence of motion in an image. Given an object, ‘object(x, y)’, which moves with a velocity (u, v) then the resulting moving image, I(x, y, t) is defined by Equation 1;

I(x, y, t)=object(x−ut, y−vt)  Equation 1

This leads directly to the constraint equation, Equation 2;

$$u \cdot \frac{\partial I(x,y,t)}{\partial x} + v \cdot \frac{\partial I(x,y,t)}{\partial y} + \frac{\partial I(x,y,t)}{\partial t} = \frac{\partial\,\mathrm{object}(x,y)}{\partial t} = 0 \qquad \text{Equation 2}$$

where, provided the moving object does not change with time (perhaps due to changing lighting or distortion) then ∂object/∂t=0. This equation is, perhaps, more easily understood by considering an example. Assume that vertical motion is zero, the horizontal gradient is +2 grey levels per pixel and the temporal gradient is −10 grey levels per field. Then the constraint equation says that the ratio of horizontal and temporal gradients implies a motion of 5 pixels/field. The relationship between spatial and temporal gradients is summarised by the constraint equation.
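The worked example above can be checked numerically. The following is a minimal sketch, assuming a hypothetical one-dimensional object that is a linear ramp of 2 grey levels per pixel moving at 5 pixels per field; the function names and the central-difference gradient estimates are illustrative choices, not part of the disclosed apparatus.

```python
def moving_image(x, t, u=5.0):
    """Equation 1 in one dimension: I(x, t) = object(x - u*t).

    The object is assumed (for illustration only) to be a linear ramp
    of 2 grey levels per pixel.
    """
    return 2.0 * (x - u * t)


def estimate_speed(image, x, t, step=1.0):
    """Solve the constraint equation u*dI/dx + dI/dt = 0 for u.

    Gradients are estimated with central differences; vertical motion
    is assumed to be zero, as in the worked example.
    """
    grad_x = (image(x + step, t) - image(x - step, t)) / (2.0 * step)
    grad_t = (image(x, t + step) - image(x, t - step)) / (2.0 * step)
    if grad_x == 0:
        raise ValueError("no horizontal detail: speed cannot be measured")
    return -grad_t / grad_x


# Horizontal gradient is +2, temporal gradient is -10, so u = 5 pixels/field.
print(estimate_speed(moving_image, x=10.0, t=3.0))  # 5.0
```

The guard against a zero spatial gradient corresponds to regions without detail, where the constraint equation gives no information about speed.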

To use the constraint equation for motion estimation it is first necessary to estimate the image gradients; the spatial and temporal gradients of brightness. In principle these are easily calculated by applying straightforward linear horizontal, vertical and temporal filters to the image sequence. In practice, in the absence of additional processing, this can only really be done for the horizontal gradient. For the vertical gradient, calculation of the brightness gradient is confused by interlace which is typically used for television pictures; pseudo-interlaced signals from film do not suffer from this problem. Interlaced signals only contain alternate picture lines on each field. Effectively this is vertical sub-sampling resulting in vertical aliasing which confuses the vertical gradient estimate. Temporally the situation is even worse: if an object has moved by more than 1 pixel in consecutive fields, pixels in the same spatial location may be totally unrelated. This would render any temporal gradient estimate meaningless. This is why gradient motion estimation cannot, in general, measure velocities greater than 1 pixel per field period (reference 8).

Prefiltering can be applied to the image sequence to avoid the problem of direct measurement of the image gradients. If spatial low pass filtering is applied to the sequence then the effective size of ‘pixels’ is increased. The brightness gradients at a particular spatial location are then related for a wider range of motion speeds. Hence spatial low pass filtering allows higher velocities to be measured, the highest measurable velocity being determined by the degree of filtering applied. Vertical low pass filtering also alleviates the problem of vertical aliasing caused by interlace. Alias components in the image tend to be more prevalent at higher frequencies. Hence, on average, low pass filtering disproportionately removes alias rather than true signal components. The more vertical filtering that is applied the less is the effect of aliasing. There are, however, some signals in which aliasing extends down to zero frequency. Filtering cannot remove all the aliasing from these signals which will therefore result in erroneous vertical gradient estimates and, therefore, incorrect estimates of the motion vector. One advantage of this invention is its ability to detect erroneous motion estimates due to vertical aliasing.

Prefiltering an image sequence results in blurring. Hence small details in the image become lost. This has two consequences: firstly the velocity estimate becomes less accurate since there is less detail in the picture and secondly small objects cannot be seen in the prefiltered signal. To improve vector accuracy hierarchical techniques are sometimes used. This involves first calculating an initial, low accuracy, motion vector using heavy prefiltering, then refining this estimate to higher accuracy using less prefiltering. This does, indeed, improve vector accuracy but it does not overcome the other disadvantage of prefiltering, that is, that small objects cannot be seen in the prefiltered signal, hence their velocity cannot be measured. No amount of subsequent vector refinement, using hierarchical techniques, will recover the motion of small objects if they are not measured in the first stage. Prefiltering is only advisable in gradient motion estimation when it is only intended to provide low accuracy motion vectors of large objects.

Once the image gradients have been estimated the constraint equation is used to calculate the corresponding motion vector. Each pixel in the image gives rise to a separate linear equation relating the horizontal and vertical components of the motion vector and the image gradients. The image gradients for a single pixel do not provide enough information to determine the motion vector for that pixel. The gradients for at least two pixels are required. In order to minimise errors in estimating the motion vector it is better to use more than two pixels and find the vector which best fits the data from multiple pixels. Consider taking gradients from 3 pixels. Each pixel restricts the motion vector to a line in velocity space. With two pixels a single, unique, motion vector is determined by the intersection of the 2 lines. With 3 pixels there are 3 lines and, possibly, no unique solution. This is illustrated in FIG. 1. The vectors E₁ to E₃ are the error from the best fitting vector to the constraint line for each pixel.

One way to calculate the best fit motion vector for a group of neighbouring pixels is to use a least mean square method, that is minimising the sum of the squares of the lengths of the error vectors E₁ to E₃ (FIG. 1). The least mean square solution for a group of neighbouring pixels is given by the solution of Equation 3;

$$\begin{bmatrix}\sigma_{xx}^{2} & \sigma_{xy}^{2} \\ \sigma_{xy}^{2} & \sigma_{yy}^{2}\end{bmatrix} \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = -\begin{bmatrix}\sigma_{xt}^{2} \\ \sigma_{yt}^{2}\end{bmatrix} \qquad \text{Equation 3}$$

$$\text{where}\quad \sigma_{xx}^{2} = \sum \frac{\partial I}{\partial x} \cdot \frac{\partial I}{\partial x},\quad \sigma_{xy}^{2} = \sum \frac{\partial I}{\partial x} \cdot \frac{\partial I}{\partial y}\quad \text{etc.}$$

where (u₀, v₀) is the best fit motion vector and the summations are over a suitable region. This is an example of the well known technique of linear regression analysis detailed, for example, in reference 19 and many other texts. The (direct) solution of equation 3 is given by Equation 4;

$$\begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \frac{1}{\sigma_{xx}^{2}\sigma_{yy}^{2} - \sigma_{xy}^{4}}\begin{bmatrix}\sigma_{xy}^{2}\sigma_{yt}^{2} - \sigma_{yy}^{2}\sigma_{xt}^{2} \\ \sigma_{xy}^{2}\sigma_{xt}^{2} - \sigma_{xx}^{2}\sigma_{yt}^{2}\end{bmatrix} \qquad \text{Equation 4}$$
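The direct solution can be illustrated with a short sketch. It assumes the per-pixel gradients are supplied as hypothetical `(Ix, Iy, It)` tuples; the function name and data layout are illustrative only. The three sample pixels are constructed to satisfy the constraint equation exactly for a motion of (3, −2).

```python
def best_fit_vector(gradients):
    """Return (u0, v0) from per-pixel gradients (Ix, Iy, It) via Equation 4."""
    gradients = list(gradients)
    # Sums of gradient products, as defined under Equation 3.
    sxx = sum(ix * ix for ix, _, _ in gradients)
    sxy = sum(ix * iy for ix, iy, _ in gradients)
    syy = sum(iy * iy for _, iy, _ in gradients)
    sxt = sum(ix * it for ix, _, it in gradients)
    syt = sum(iy * it for _, iy, it in gradients)

    # Denominator of Equation 4; it vanishes for plain regions or a single edge.
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("insufficient detail: Equation 3 has no unique solution")

    u0 = (sxy * syt - syy * sxt) / det
    v0 = (sxy * sxt - sxx * syt) / det
    return u0, v0


# Each pixel satisfies 3*Ix - 2*Iy + It = 0, i.e. true motion (3, -2).
pixels = [(1.0, 0.0, -3.0), (0.0, 1.0, 2.0), (1.0, 1.0, -1.0)]
print(best_fit_vector(pixels))  # (3.0, -2.0)
```

The explicit check on the determinant marks the case in which the direct solution breaks down.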

Analysing small image regions produces detailed vector fields of low accuracy and vice versa for large regions. There is little point in choosing a region which is smaller than the size of the prefilter since the pixels within such a small region are not independent.

Typically, motion estimators generate motion vectors on the same standard as the input image sequence. For motion compensated standards converters, or other systems performing motion compensated temporal interpolation, it is desirable to generate motion vectors on the output image sequence standard. For example when converting between European and American television standards the input image sequence is 625 line 50 Hz (interlaced) and the output standard is 525 line 60 Hz (interlaced). A motion compensated standards converter operating on a European input is required to produce motion vectors on the American output television standard.

The direct implementation of gradient motion estimation, discussed herein in relation to FIGS. 2 and 3, can give wildly erroneous results. Such behaviour is extremely undesirable. These problems occur when there is insufficient information in a region of an image to make an accurate velocity estimate. This would typically arise when the analysis region contained no detail at all or only the edge of an object. In such circumstances it is either not possible to measure velocity or only possible to measure velocity normal to the edge. It is attempting to estimate the complete motion vector, when insufficient information is available, which causes problems. Numerically the problem is caused by the 2 terms in the denominator of equation 4 becoming very similar resulting in a numerically unstable solution for equation 3.

A solution to this problem of gradient motion estimation has been suggested by Martinez (references 11 and 12). The matrix in equation 3 (henceforth denoted ‘M’) may be analysed in terms of its eigenvectors and eigenvalues. There are 2 eigenvectors, one of which points parallel to the predominant edge in the analysis region and the other points normal to that edge. Each eigenvector has an associated eigenvalue which indicates how sharp the image is in the direction of the eigenvector. The eigenvectors and values are defined by Equation 5;

$$M \cdot e_{i} = \lambda_{i} e_{i}, \quad i \in \{1, 2\} \qquad \text{where;}\quad M = \begin{bmatrix}\sigma_{xx}^{2} & \sigma_{xy}^{2} \\ \sigma_{xy}^{2} & \sigma_{yy}^{2}\end{bmatrix} \qquad \text{Equation 5}$$

The eigenvectors $e_{i}$ are conventionally defined as having length 1, which convention is adhered to herein.

In plain areas of the image the eigenvectors have essentially random direction (there are no edges) and both eigenvalues are very small (there is no detail). In these circumstances the only sensible vector to assume is zero. In parts of the image which contain only an edge feature the eigenvectors point normal to the edge and parallel to the edge. The eigenvalue corresponding to the normal eigenvector is (relatively) large and the other eigenvalue small. In this circumstance only the motion vector normal to the edge can be measured. In other circumstances, in detailed parts of the image where more information is available, the motion vector may be calculated using Equation 4.

The motion vector may be found, taking into account Martinez' ideas above, by using Equation 6;

$$\begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = -\left(\frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}}\, e_{1}e_{1}^{t} + \frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}}\, e_{2}e_{2}^{t}\right) \cdot \begin{bmatrix}\sigma_{xt}^{2} \\ \sigma_{yt}^{2}\end{bmatrix} \qquad \text{Equation 6}$$

where superscript t represents the transpose operation. Here n₁ & n₂ are the computational or signal noise involved in calculating λ₁ & λ₂ respectively. In practice n₁≈n₂, both being determined by, and approximately equal to, the noise in the coefficients of M. When λ₁ & λ₂<<n then the calculated motion vector is zero; as is appropriate for a plain region of the image. When λ₁>>n and λ₂<<n then the calculated motion vector is normal to the predominant edge in that part of the image. Finally if λ₁, λ₂>>n then equation 6 becomes equivalent to equation 4. As signal noise, and hence n, decreases then equation 6 provides an increasingly more accurate estimate of the motion vectors as would be expected intuitively.
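The three regimes described above can be reproduced with a small sketch of Equation 6. It assumes a single noise value n = n₁ = n₂ > 0 and uses the closed-form eigen-decomposition of a symmetric 2×2 matrix; the function name and argument layout are hypothetical and chosen only for illustration.

```python
import math


def martinez_vector(sxx, sxy, syy, sxt, syt, noise):
    """Evaluate Equation 6 for M = [[sxx, sxy], [sxy, syy]], assuming n1 = n2 = noise > 0."""
    # Closed-form eigenvalues of a symmetric 2x2 matrix (lam[0] >= lam[1]).
    mean = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    lam = (mean + radius, mean - radius)

    # Unit eigenvectors: e1 at angle phi, e2 perpendicular to it.
    phi = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    vectors = ((math.cos(phi), math.sin(phi)), (-math.sin(phi), math.cos(phi)))

    u0 = v0 = 0.0
    for value, (ex, ey) in zip(lam, vectors):
        gain = value / (value * value + noise * noise)
        projection = ex * sxt + ey * syt  # e^t applied to [sxt, syt]
        u0 -= gain * projection * ex
        v0 -= gain * projection * ey
    return u0, v0


# Detailed region with negligible noise: agrees with Equation 4, approximately (3, -2).
print(martinez_vector(2.0, 1.0, 2.0, -4.0, 1.0, noise=1e-9))
# Vertical edge only: the result is confined to the edge normal (the x axis here).
print(martinez_vector(100.0, 0.0, 0.0, -200.0, 0.0, noise=1.0))
# Plain region: both eigenvalues are far below the noise, so the vector is near zero.
print(martinez_vector(1e-6, 0.0, 1e-6, 1e-6, 1e-6, noise=1.0))
```

Because each eigen-direction is weighted by λ/(λ² + n²) rather than 1/λ, directions with little gradient energy contribute almost nothing instead of producing an unstable division.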

In practice calculating motion vectors using the Martinez technique involves replacing the apparatus of FIG. 3, below, with more complex circuitry. The direct solution of equation 6 would involve daunting computational and hardware complexity. It can, however, be implemented using only two-input, pre-calculated, look up tables and simple arithmetic operations.

A block diagram of a direct implementation of gradient motion estimation is shown in FIGS. 2 & 3.

The apparatus shown schematically in FIG. 2 performs filtering and calculation of gradient products and their summations. The apparatus of FIG. 3 generates motion vectors from the sums of gradient products produced by the apparatus of FIG. 2. The horizontal (10) and vertical (12) low pass filters in FIG. 2 perform spatial prefiltering as discussed above. The cut-off frequencies of 1/32nd band horizontally and 1/16th band vertically allow motion speeds up to (at least) 32 pixels per field to be measured. Different cut-off frequencies could be used if a different range of speeds is required. The image gradients are calculated by three temporal and spatial differentiating filters (16,17,18).

The vertical/temporal interpolation filters (20) convert the image gradients, measured on the input standard, to the output standard. Typically the vertical/temporal interpolators (20) are bilinear interpolators or other polyphase linear interpolators. Thus the output motion vectors are also on the output standard. The interpolation filters are a novel feature (subject of the applicant's co-pending UK Patent Application filed on identical date hereto) which facilitates interfacing the motion estimator to a motion compensated temporal interpolator. Temporal low pass filtering is normally performed as part of (all 3 of) the interpolation filters. The temporal filter (14) has been re-positioned in the processing path so that only one rather than three filters are required. Note that the filters (10,12,14) prior to the multiplier array can be implemented in any order because they are linear filters. The summations of gradient products, specified in equation 3, are implemented by the low pass filters (24) following the multiplier array. Typically these filters (24) would be (spatial) running average filters, which give equal weight to each tap within their region of support. Other lowpass filters could also be used at the expense of more complex hardware. The size of these filters (24) determines the size of the neighbourhood used to calculate the best fitting motion vector. Examples of filter coefficients which may be used can be found in the example.

A block diagram of apparatus capable of implementing equation 6 and which replaces that of FIG. 3, is shown in FIGS. 4 and 5.

Each of the ‘eigen analysis’ blocks (30), in FIG. 4, performs the analysis for one of the two eigenvectors. The output of the eigen-analysis is a vector (with x and y components) equal to $s_{i} = e_{i}\sqrt{\lambda_{i}/(\lambda_{i}^{2} + n^{2})}$. These ‘s’ vectors are combined with vector $(\sigma_{xt}^{2}, \sigma_{yt}^{2})$ (denoted c in FIG. 4), according to equation 6, to give the motion vector according to the Martinez technique.

The eigen analysis, illustrated in FIG. 5, has been carefully structured so that it can be implemented using lookup tables with no more than 2 inputs. This has been done since lookup tables with 3 or more inputs would be impracticably large using today's technology. The implementation of FIG. 5 is based on first normalising the matrix M by dividing all its elements by $(\sigma_{xx}^{2} + \sigma_{yy}^{2})$. This yields a new matrix, N, with the same eigenvectors (e₁ & e₂) and different (but related) eigenvalues (χ₁ & χ₂). The relationship between M, N and their eigenvectors and values is given by Equation 7.

$$N = \frac{1}{\sigma_{xx}^{2} + \sigma_{yy}^{2}}\, M = \begin{bmatrix}\frac{\sigma_{xx}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} & \frac{\sigma_{xy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} \\ \frac{\sigma_{xy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} & \frac{\sigma_{yy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}}\end{bmatrix} \qquad \text{Equation 7}$$

$$M \cdot e_{i} = \lambda_{i} \cdot e_{i}, \qquad N \cdot e_{i} = \chi_{i} \cdot e_{i}, \qquad \lambda_{i} = \left(\sigma_{xx}^{2} + \sigma_{yy}^{2}\right)\chi_{i}, \qquad n_{\lambda} = \left(\sigma_{xx}^{2} + \sigma_{yy}^{2}\right) n_{\chi}$$

Matrix N is simpler than M as it contains only two independent values, since the principal diagonal elements (N_(1,1), N_(2,2)) sum to unity and the minor diagonal elements (N_(1,2), N_(2,1)) are identical. The principal diagonal elements may be coded as $(\sigma_{xx}^{2} - \sigma_{yy}^{2})/(\sigma_{xx}^{2} + \sigma_{yy}^{2})$ since Equation 8;

$$N_{1,1} = \frac{1}{2}\left(1 + \frac{\sigma_{xx}^{2} - \sigma_{yy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}}\right), \qquad N_{2,2} = \frac{1}{2}\left(1 - \frac{\sigma_{xx}^{2} - \sigma_{yy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}}\right) \qquad \text{Equation 8}$$

Hence lookup tables 1 & 2 have all the information they require to find the eigenvalues and vectors of N using standard techniques. It is therefore straightforward to precalculate the contents of these lookup tables. Lookup table 3 simply implements the square root function. The key features of the apparatus shown in FIG. 5 are that the eigen analysis is performed on the normalised matrix, N, using 2 input lookup tables (1 & 2) and the eigenvalue analysis (from table 2) is rescaled to the correct value using the output of table 3.
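The reduction to two table inputs can be made concrete with a sketch based on Equations 7 and 8. The two inputs are taken here to be d = (σ²xx − σ²yy)/(σ²xx + σ²yy) and o = σ²xy/(σ²xx + σ²yy); the closed form χ = ½(1 ± √(d² + 4o²)) follows from the symmetric 2×2 structure of N. The function name is hypothetical and the arithmetic stands in for what a precalculated table would store, so this is an illustration rather than the table contents of FIG. 5.

```python
import math


def normalised_eigenvalues(sxx, sxy, syy):
    """Return (chi, lam): eigenvalues of N and the rescaled eigenvalues of M."""
    trace = sxx + syy
    if trace == 0:
        raise ValueError("no gradient energy: N is undefined")

    # The only two independent quantities in N (Equations 7 and 8).
    d = (sxx - syy) / trace  # codes the principal diagonal
    o = sxy / trace          # the shared off-diagonal element

    # Eigenvalues of N = [[(1 + d)/2, o], [o, (1 - d)/2]].
    radius = 0.5 * math.hypot(d, 2.0 * o)
    chi = (0.5 + radius, 0.5 - radius)

    # Equation 7: lambda_i = (sxx + syy) * chi_i.
    lam = tuple(trace * value for value in chi)
    return chi, lam


chi, lam = normalised_eigenvalues(2.0, 1.0, 2.0)
print(chi)  # (0.75, 0.25)
print(lam)  # (3.0, 1.0)
```

Since d and o are both bounded in magnitude by 1, a table indexed by them has a fixed, small input range regardless of the dynamic range of the gradient sums.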

The gradient motion estimator described above is undesirably complex. The motion estimator is robust to images containing limited information but FIGS. 4 and 5 show the considerable complexity involved. The situation is made worse by the fact that many of the signals have a very wide dynamic range making the functional blocks illustrated much more difficult to implement.

A technique which yields considerable simplifications without sacrificing performance, based on normalising the basic constraint equation (equation 2) to control the dynamic range of the signals, is the subject of the applicant's co-pending application filed on identical date hereto. As well as reducing dynamic range this also makes other simplifications possible.

Dividing the constraint equation by the modulus of the gradient vector yields a normalised constraint equation i.e. Equation 9:

$$\frac{u\,\frac{\partial I}{\partial x} + v\,\frac{\partial I}{\partial y}}{|\nabla I|} = -\frac{\frac{\partial I}{\partial t}}{|\nabla I|} \qquad \text{Equation 9}$$

$$\text{where:}\quad \nabla I = \begin{bmatrix}\frac{\partial I}{\partial x} \\ \frac{\partial I}{\partial y}\end{bmatrix} \quad \& \quad |\nabla I| = \sqrt{\left(\frac{\partial I}{\partial x}\right)^{2} + \left(\frac{\partial I}{\partial y}\right)^{2}}$$

The significance of this normalisation step becomes more apparent if equation 9 is rewritten as Equation 10;

$$u \cdot \cos(\theta) + v \cdot \sin(\theta) = vn \qquad \text{Equation 10}$$

$$\text{where:}\quad \cos(\theta) = \frac{\frac{\partial I}{\partial x}}{|\nabla I|},\quad \sin(\theta) = \frac{\frac{\partial I}{\partial y}}{|\nabla I|},\quad vn = -\frac{\frac{\partial I}{\partial t}}{|\nabla I|}$$

in which θ is the angle between the spatial image gradient vector (∇I) and the horizontal; vn is the motion speed in the direction of the image gradient vector, that is, normal to the predominant edge in the picture at that point. This seems a much more intuitive equation relating, as it does, the motion vector to the image gradient and the motion speed in the direction of the image gradient. The coefficients of equation 10 (cos(θ) & sin(θ)) have a well defined range (magnitude 0 to 1) and approximately the same dynamic range as the input signal (typically 8 bits). Similarly vn has a maximum (sensible) value determined by the desired motion vector measurement range. Values of vn greater than the maximum measurement range, which could result from either noise or ‘cuts’ in the input picture sequence, can reasonably be clipped to the maximum sensible motion speed.

The normalised constraint equation 10 can be solved to find the motion vector in the same way as the unnormalised constraint equation 2. With normalisation, equation 3 becomes Equation 11;

$$\begin{bmatrix}\sum \cos^{2}(\theta) & \sum \cos(\theta) \cdot \sin(\theta) \\ \sum \cos(\theta) \cdot \sin(\theta) & \sum \sin^{2}(\theta)\end{bmatrix} \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \begin{bmatrix}\sum vn \cdot \cos(\theta) \\ \sum vn \cdot \sin(\theta)\end{bmatrix} \quad \text{or:} \quad \Phi \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \psi \qquad \text{Equation 11}$$

In fact matrix (Φ) has only 2 independent elements, since cos²(x)+sin²(x)=1. This is more clearly seen by rewriting cos²(x) and sin²(x) as ½(1±cos(2x)) hence equation 11 becomes Equation 12

$$\frac{1}{2} \cdot \left(N \cdot I + \begin{bmatrix}\sum \cos(2\theta) & \sum \sin(2\theta) \\ \sum \sin(2\theta) & -\sum \cos(2\theta)\end{bmatrix}\right) \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \begin{bmatrix}\sum vn \cdot \cos(\theta) \\ \sum vn \cdot \sin(\theta)\end{bmatrix} \qquad \text{Equation 12}$$

where I is the (2×2) identity matrix and N is the number of pixels included in the summations. Again the motion vector can be found using equation 13:

$$\begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \left(\frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}}\, e_{1}e_{1}^{t} + \frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}}\, e_{2}e_{2}^{t}\right) \cdot \begin{bmatrix}\sum vn \cdot \cos(\theta) \\ \sum vn \cdot \sin(\theta)\end{bmatrix} \qquad \text{Equation 13}$$

where now e and λ are the eigenvectors and eigenvalues of Φ rather than M. Because Φ only has two independent elements, the eigen-analysis can now be performed using only three, two-input, lookup tables. Furthermore the dynamic range of the elements of Φ (equation 11) is much less than that of the elements of M, thereby greatly reducing the hardware complexity.
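Equations 12 and 13 can be sketched as follows. The eigenvalues ½(N ± R), with R = √((Σcos 2θ)² + (Σsin 2θ)²), and the eigenvector angle ½·atan2(Σsin 2θ, Σcos 2θ) follow from the form of Equation 12; the function name, the list-of-tuples input and the single noise value n = n₁ = n₂ > 0 are assumptions made for illustration, and direct arithmetic stands in for the lookup tables.

```python
import math


def solve_normalised(constraints, noise):
    """Evaluate Equation 13 from per-pixel (theta, vn) pairs, assuming n1 = n2 = noise > 0."""
    constraints = list(constraints)
    count = len(constraints)  # N in Equation 12

    # The two independent elements of the matrix in Equation 12.
    cos2 = sum(math.cos(2.0 * theta) for theta, _ in constraints)
    sin2 = sum(math.sin(2.0 * theta) for theta, _ in constraints)

    # Right-hand side vector of Equations 11 to 13.
    psi_x = sum(vn * math.cos(theta) for theta, vn in constraints)
    psi_y = sum(vn * math.sin(theta) for theta, vn in constraints)

    # Eigenvalues and unit eigenvectors of the matrix in Equation 12.
    radius = math.hypot(cos2, sin2)
    lam = (0.5 * (count + radius), 0.5 * (count - radius))
    phi = 0.5 * math.atan2(sin2, cos2)
    vectors = ((math.cos(phi), math.sin(phi)), (-math.sin(phi), math.cos(phi)))

    u0 = v0 = 0.0
    for value, (ex, ey) in zip(lam, vectors):
        gain = value / (value * value + noise * noise)
        projection = ex * psi_x + ey * psi_y
        u0 += gain * projection * ex
        v0 += gain * projection * ey
    return u0, v0


# Three pixels with gradient angles 0, 90 and 45 degrees, true motion (3, -2).
true_u, true_v = 3.0, -2.0
angles = [0.0, math.pi / 2.0, math.pi / 4.0]
pairs = [(a, true_u * math.cos(a) + true_v * math.sin(a)) for a in angles]
print(solve_normalised(pairs, noise=1e-9))  # approximately (3, -2)
```

Unlike Equation 6, there is no leading minus sign here because the sign is already absorbed into vn in Equation 10.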

A block diagram of a gradient motion estimator using the Martinez technique and based on the normalised constraint equation is shown in FIGS. 6 & 7.

The apparatus of FIG. 6 performs the calculation of the normalised constraint equation (equation 10) for each pixel or data value. Obviously, if prefiltering is performed the number of independent pixel values is reduced, since the effective pixel size is greater. The filtering in FIG. 6 is identical to that in FIG. 2. The spatial image gradients converted to the output standard are used as inputs for a rectangular to polar co-ordinate converter (32) which calculates the magnitude of the spatial image vector and the angle θ. A suitable converter can be obtained from Raytheon (Co-ordinate transformer, model TMC 2330). A lookup table (34) is used to avoid division by very small numbers when there is no detail in a region of the input image. The constant term, ‘n’, used in the lookup table is the measurement noise in estimating |∇I| which depends on the input signal to noise ratio and the prefiltering used. A limiter (36) has also been introduced to restrict the normal velocity, vn, to its expected range (determined by the spatial prefilter). The normal velocity might, otherwise, exceed its expected range when the constraint equation is violated, for example at picture cuts. A key feature of FIG. 6 is that, due to the normalisation that has been performed, the two outputs, vn & θ, have a much smaller dynamic range than the three image gradients in FIG. 2, thereby allowing a reduction in the hardware complexity.
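The per-pixel processing of FIG. 6 can be sketched in software. The text does not state the function stored in lookup table (34), so the regularised reciprocal |∇I|/(|∇I|² + n²) used below is an assumption, chosen only because it avoids division by very small numbers; the function name and parameters are likewise hypothetical, and `noise` is assumed to be greater than zero.

```python
import math


def normalised_constraint(grad_x, grad_y, grad_t, noise, vn_max):
    """Return (theta, vn) of Equation 10 for one pixel.

    Assumes noise > 0. The regularised reciprocal below is an assumed
    stand-in for lookup table (34), not its documented contents.
    """
    # Rectangular to polar conversion, as performed by converter (32).
    magnitude = math.hypot(grad_x, grad_y)
    theta = math.atan2(grad_y, grad_x)

    # Approximately 1/|grad I| when |grad I| >> noise, tends to 0 in plain areas.
    reciprocal = magnitude / (magnitude * magnitude + noise * noise)
    vn = -grad_t * reciprocal

    # Limiter (36): clip vn to the expected measurement range.
    vn = max(-vn_max, min(vn_max, vn))
    return theta, vn


print(normalised_constraint(2.0, 0.0, -10.0, noise=1e-6, vn_max=32.0))   # theta 0, vn close to 5
print(normalised_constraint(0.0, 0.0, -10.0, noise=1.0, vn_max=32.0))    # plain area: vn is 0
print(normalised_constraint(0.1, 0.0, -100.0, noise=0.01, vn_max=32.0))  # clipped to 32.0
```

The clipping value of 32 pixels per field in the usage lines mirrors the measurement range quoted for the prefilters of FIG. 2 and is otherwise arbitrary.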

In the apparatus of FIG. 6 the input video is first filtered using separate temporal, vertical and horizontal filters (10,12,14), the image gradients are calculated using three differentiating filters (16,18) and then converted, from the input lattice, to the output sampling lattice using three vertical/temporal interpolators (20), typically bilinear or other polyphase linear filters. For example, with a 625/50/2:1 input the image gradients might be calculated on a 525/60/2:1 lattice. The parameters of the normalised constraint equation, vn & θ, are calculated as shown.

The apparatus of FIG. 7 calculates the best fitting motion vector, corresponding to a region of the input image, from the constraint equations for the pixels in that region. The summations specified in equation 12 are implemented by the lowpass filters (38) following the polar to rectangular co-ordinate converter (40) and lookup tables 5 & 6. Typically these filters (38) would be (spatial) running average filters, which give equal weight to each tap within their region of support. Other lowpass filters could also be used at the expense of more complex hardware. The size of these filters (38) determines the size of the neighbourhood used to calculate the best fitting motion vector. Lookup tables 5 & 6 are simply cosine and sine lookup tables. Lookup tables 7 to 9 contain precalculated values of matrix ‘Z’ defined by Equation 14;

$$Z = \left(\frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}}\, e_{1}e_{1}^{t} + \frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}}\, e_{2}e_{2}^{t}\right) \qquad \text{Equation 14}$$

where e and λ are the eigenvectors and eigenvalues of Φ. Alternatively Z could be Φ⁻¹ (i.e. assuming no noise), but this would not apply the Martinez technique and would give inferior results. A key feature of FIG. 7 is that the elements of matrix Z are derived using 2 input lookup tables. Their inputs are the output from the two lowpass filters (39) which have a small dynamic range allowing the use of small lookup tables.

The implementations of the gradient motion techniques discussed above seek to find the ‘best’ motion vector for a region of the input picture. However it is only appropriate to use this motion vector, for motion compensated processing, if it is reasonably accurate. Whilst the determined motion vector is the ‘best fit’ this does not necessarily imply that it is also an accurate vector. The use of inaccurate motion vectors, in performing motion compensated temporal interpolation, results in objectionable impairments to the interpolated image. To avoid these impairments it is desirable to revert to a non-motion compensated interpolation algorithm when the motion vector cannot be measured accurately. To do this it is necessary to know the accuracy of the estimated motion vectors. If a measure of vector accuracy is available then the interpolation method can be varied between ‘full motion compensation’ and no motion compensation depending on vector accuracy, a technique known as ‘graceful fallback’ described in reference 16.
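The idea of varying the interpolation between the two extremes can be illustrated with a minimal sketch. The text does not specify how vector accuracy is mapped to the interpolation method, so the linear cross-fade below, the `confidence` value in the range 0 to 1 and the function name are all assumptions made purely for illustration; they are not the method of reference 16.

```python
def graceful_fallback(mc_value, non_mc_value, confidence):
    """Blend a motion compensated and a non-motion compensated pixel value.

    confidence is an assumed accuracy measure: 1.0 selects full motion
    compensation, 0.0 selects no motion compensation.
    """
    weight = min(1.0, max(0.0, confidence))  # keep the weight in [0, 1]
    return weight * mc_value + (1.0 - weight) * non_mc_value


print(graceful_fallback(100.0, 50.0, confidence=1.0))   # 100.0
print(graceful_fallback(100.0, 50.0, confidence=0.25))  # 62.5
print(graceful_fallback(100.0, 50.0, confidence=-3.0))  # 50.0
```

A continuous weight avoids abrupt switching between the two interpolation modes as the accuracy measure varies across the picture.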

It has been suggested (reference 16) to provide an indication of motion vector reliability in phase correlation systems determined from the relative height of the correlation peaks produced. In block matching systems, an error indication is given by the quality of the match between picture blocks. Neither of these options measures the actual error of the motion vectors but merely provides an indication thereof. In the latter case the “confidence” in the motion vectors is given by a difference in grey levels between the blocks and is not, therefore, necessarily related to the motion vector error.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a technique for determining the accuracy of motion vectors. This method is based on the use of the constraint equation and hence is particularly suitable for use with gradient based motion estimation techniques as described above. The method, however, is more general than this and could also be used to estimate the accuracy of motion vectors measured in other ways, for example, using a block matching technique. The measurement of the accuracy of motion vectors is a new technique. Most of the literature on motion estimation concentrates almost wholly on ways of determining the ‘best’ motion vector and pays scant regard to considering whether the resulting motion vectors are actually accurate. This may, in part, explain why motion compensated processing is, typically, unreliable for certain types of input image.

The invention provides video or film signal processing apparatus comprising motion estimation apparatus for generating motion vectors each corresponding to a region of an input video signal, means for calculating for each of said regions a plurality of spatial and temporal image gradients, and means for calculating for each motion vector a plurality of error values corresponding to said plurality of image gradients, the apparatus having as an output for each motion vector a corresponding indication of the motion vector measurement error derived from said plurality of error values.

The motion estimation apparatus preferably includes said means for calculating the image gradients.

The motion estimation apparatus preferably calculates the motion vectors from the normalised constraint equation of a plurality of image gradients and generates a corresponding plurality of outputs each equal to the angle (θ) corresponding to the orientation of the spatial image gradient vector and the speed (vn) in the direction of the spatial image gradient vector.

The means for calculating a plurality of error values includes sine and cosine lookup tables having the values of θ as an input and an arithmetic unit having as inputs each motion vector, a corresponding plurality of values of vn and the sines and cosines of θ.

The apparatus may comprise multiplier means for generating a plurality of error vectors and having said error values and the corresponding values of sin θ and cos θ as inputs.
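The inputs listed above suggest how the error values and error vectors might be formed. The sketch below assumes that each error value is the residual of the normalised constraint equation (Equation 10), vn − (u·cos θ + v·sin θ), and that each error vector is that residual multiplied by (cos θ, sin θ); the sign convention, function name and data layout are assumptions for illustration and are not taken from the text.

```python
import math


def constraint_errors(u, v, constraints):
    """Return a list of (error, (error_x, error_y)) for per-pixel (theta, vn) pairs.

    The residual of Equation 10 is assumed to be the error value; its
    sign convention is an illustrative choice.
    """
    results = []
    for theta, vn in constraints:
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        error = vn - (u * cos_t + v * sin_t)  # scalar error value
        results.append((error, (error * cos_t, error * sin_t)))  # error vector
    return results


# Motion vector (3, -2); the second pixel disagrees with it by 1 along its gradient.
for error, vector in constraint_errors(3.0, -2.0, [(0.0, 3.0), (math.pi / 2.0, -1.0)]):
    print(round(error, 6), tuple(round(c, 6) for c in vector))
```

Under this assumption each error vector points along the local image gradient, which is the direction in which a single pixel's constraint line can be displaced from the motion vector.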

The apparatus preferably comprises means for generating at least one parameter giving an indication of the extent of the distribution of motion vector measurement errors.

The invention also provides a method of processing video or film signals comprising generating motion vectors each corresponding to a region of an input signal, for each region calculating a plurality of spatial and temporal image gradients, calculating a plurality of error values corresponding to said plurality of image gradients, and generating for each motion vector a corresponding indication of the motion vector measurement error derived from said plurality of error values.

The motion vectors may be generated based on the constraint equations corresponding to said plurality of image gradients.

The method may comprise calculating for each plurality of image gradients corresponding to each of said regions, an angle (θ) corresponding to the orientation of the spatial image gradient vector and the motion speed (vn) in the direction of said spatial image gradient vector.

The method preferably comprises calculating a plurality of error vectors from said error values.

The indication of motion vector measurement error may be in the form of at least one parameter indicating the extent of the distribution of motion vector measurement errors.

In an embodiment the said at least one parameter includes a scalar motion vector error signal. In a further embodiment the said at least one parameter includes four values representing the spread in motion vector measurement error. These four values may be comprised of two, two-component, vectors.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in more detail with reference to the accompanying drawings in which:

FIG. 1 shows graphically the image gradient constraint lines for three pixels.

FIGS. 2 and 3 are a block diagram of a motion estimator.

FIG. 4 is a block diagram of apparatus for calculating motion vectors which can be substituted for the apparatus of FIG. 3.

FIG. 5 is a block diagram of apparatus for implementing the eigen analysis required in FIG. 4.

FIGS. 6 and 7 show another example of a motion estimation apparatus.

FIG. 8 shows graphically the distribution of errors in the case of a best fit motion vector.

FIG. 9 is a block diagram of apparatus for calculating the elements of an error matrix.

FIG. 10 is a block diagram for calculating a scalar error factor.

FIG. 11 is a block diagram for calculating the elements of a covariance matrix.

FIG. 12 is an apparatus according to the invention for generating error values in the form of spread vectors and a scalar measurement of the error.

FIG. 13 is another embodiment of apparatus according to the invention which can be substituted for the apparatus of FIGS. 11 and 12.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Once a motion vector has been estimated for a region of an image an error may be calculated for each pixel within that region. That error is an indication of how accurately the motion vector satisfies the constraint equation or the normalised constraint equation (equations 2 and 10 above respectively). The following discussion will use the normalised constraint equation as this seems a more objective choice but the unnormalised constraint equation could also be used with minor changes (the use of the unnormalised constraint equation amounts to giving greater prominence to pixels with larger image gradients). For the ith pixel within the analysis region the error is given by Equation 15;

error_(i) = vn_(i) − u₀·cos(θ_(i)) − v₀·sin(θ_(i))  ∀ 1≤i≤N  Equation 15

(for all i when 1≤i≤N, where N is the number of pixels in the analysis region).

This error corresponds to the distance of the ‘best’ motion vector, (u₀, v₀), from the constraint line for that pixel (see FIG. 1). Note that equation 11 above gives a motion vector which minimises the sum of the squares of these errors. Each error value is associated with the direction of the image gradient for that pixel. Hence the errors are better described as an error vector, E_(i), illustrated in FIG. 1 and defined by Equation 16;

E_(i)^(t) = error_(i)·[cos(θ_(i)), sin(θ_(i))]  Equation 16

where superscript t represents the transpose operation.
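As an illustration of Equations 15 and 16, the following minimal sketch computes the scalar error and the error vector for each pixel. The function name `error_vectors` and the three-pixel test data are hypothetical and chosen only for this example; the angles are assumed to be in radians.

```python
import math

def error_vectors(vn, theta, u0, v0):
    """Return a list of (error_i, (Ex_i, Ey_i)) pairs, one per pixel."""
    out = []
    for vn_i, th_i in zip(vn, theta):
        c, s = math.cos(th_i), math.sin(th_i)
        err = vn_i - u0 * c - v0 * s           # Equation 15
        out.append((err, (err * c, err * s)))  # Equation 16
    return out

# Hypothetical data: three pixels whose constraint lines all pass
# through the motion vector (2, 1), so every error should vanish.
theta = [0.0, math.pi / 2, math.pi / 4]
vn = [2.0 * math.cos(t) + 1.0 * math.sin(t) for t in theta]
result = error_vectors(vn, theta, 2.0, 1.0)
```

When the candidate vector lies on every constraint line, as in this data, all errors are zero to within rounding; moving the candidate vector away from (2, 1) produces non-zero error vectors pointing along the respective image gradients.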

The set of error vectors, {E_(i)}, form a two dimensional distribution of errors in motion vector space, illustrated in FIG. 8. This distribution of motion vector measurement errors would be expected to be a two dimensional Gaussian (or Normal) distribution. Conceptually the distribution occupies an elliptical region around the true motion vector. The ellipse defines the area in which most of the estimates of the motion vector would lie; the ‘best’ motion vector points to the centre of the ellipse. FIG. 8 illustrates the ‘best’ motion vector, (u₀, v₀), and 4 typical error vectors, E₁ to E₄. The distribution of motion vector measurement errors is characterised by the orientation and length of the major and minor axes (σ₁, σ₂) of the ellipse. To calculate the characteristics of this distribution we must first form the (N×2) matrix defined as Equation 17; $\begin{matrix}{E = {\begin{bmatrix}E_{1}^{t} \\ E_{2}^{t} \\ \vdots \\ E_{N}^{t}\end{bmatrix} = \begin{bmatrix}{{error}_{1} \cdot \cos(\theta_{1})} & {{error}_{1} \cdot \sin(\theta_{1})} \\ {{error}_{2} \cdot \cos(\theta_{2})} & {{error}_{2} \cdot \sin(\theta_{2})} \\ \vdots & \vdots \\ {{error}_{N} \cdot \cos(\theta_{N})} & {{error}_{N} \cdot \sin(\theta_{N})}\end{bmatrix}}} & {{Equation}\quad 17}\end{matrix}$

The length and orientation of the axes of the error distribution are given by eigenvector analysis of E^(t)·E; the eigenvectors point along the axes of the distribution and the eigenvalues, N·σ₁² & N·σ₂² (where N is the total number of pixels in the region used to estimate the errors), give their length (see FIG. 8), that is, Equation 18;

Q·c_(i) = σ_(i)²·c_(i)  Equation 18

where i=1 or 2; and Q = (1/N)·(E^(t)·E)

The matrix (E^(t)·E)/N (henceforth the ‘error matrix’ and denoted Q for brevity) can be expanded to give Equation 19; $\begin{matrix}{Q = \begin{bmatrix}{\frac{1}{N}\sum {error}^{2} \cdot \cos^{2}(\theta)} & {\frac{1}{N}\sum {error}^{2} \cdot \cos(\theta) \cdot \sin(\theta)} \\ {\frac{1}{N}\sum {error}^{2} \cdot \cos(\theta) \cdot \sin(\theta)} & {\frac{1}{N}\sum {error}^{2} \cdot \sin^{2}(\theta)}\end{bmatrix}} & {{Equation}\quad 19}\end{matrix}$

where the summation is over a region of the image containing N pixels.
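The expansion of the error matrix can be sketched in software as a direct accumulation over the pixels of the region. This is only an illustrative sketch: the function name `error_matrix` is hypothetical, and a plain sum over a list stands in for the spatial lowpass (running average) filters of the hardware.

```python
import math

def error_matrix(errors, theta):
    """Return (Q11, Q12, Q22) of the symmetric error matrix of Equation 19."""
    n = len(errors)
    q11 = q12 = q22 = 0.0
    for err, th in zip(errors, theta):
        c, s = math.cos(th), math.sin(th)
        e2 = err * err
        q11 += e2 * c * c
        q12 += e2 * c * s
        q22 += e2 * s * s
    return q11 / n, q12 / n, q22 / n

# Hypothetical data: one pixel with a horizontal gradient and an error
# of 2, one pixel with a vertical gradient and no error.
q = error_matrix([2.0, 0.0], [0.0, math.pi / 2])
```

For this data all of the measured error lies along the horizontal axis, so only the Q₁₁ element is non-zero.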

To calculate the distribution of motion vector measurement errors it is necessary to first calculate the elements of the error matrix, according to equation 19, then calculate its eigenvectors and eigenvalues. The elements of the error matrix may be calculated by the apparatus of FIG. 9. Other implementations are possible, but FIG. 9 is straightforward and efficient. The inputs to FIG. 9, θ and vn, may be derived as in FIG. 6. The motion vector input to FIG. 9, (u, v), could be derived as in FIG. 7; however, it could equally well come from any other source such as FIG. 3 or 4 or even a block matching motion estimator. The lookup tables (10 and 11) are simply cosine and sine tables and, as in FIGS. 2 & 7, the required summations are performed using spatial lowpass filters (42) such as running average filters.

Although the error matrix, according to equation 19, can give a good indication of the vector error, for some types of picture it may be misleading. Misleading results, using the error matrix, may occur in parts of the picture which contain predominantly an edge feature. With this type of picture the error matrix gives an underestimate of the vector error parallel to the edge. That is, the error matrix is a biased measure of the vector error under these circumstances. The reason for this bias can be understood by considering a set of nearly parallel constraint lines (as specified in equations 2, 9 or 10 and illustrated in FIG. 1). With nearly parallel constraint lines the error vectors (defined in equation 16) will be nearly perpendicular to the constraint lines and hence perpendicular to the edge feature in the image. In these circumstances the major error in the estimate of the motion vector will be parallel to the edge. However, the error vectors will have only a small component in this direction, hence underestimating the true error in this direction.

An alternative measure of the error in the motion vector can be derived using the techniques of linear regression analysis (described in reference 19 and elsewhere). In regression analysis it is assumed that a random (zero mean) error term, with known standard deviation, is added to each constraint equation. Knowing the error added to each constraint equation, the techniques of linear algebra can be applied to calculate the cumulative effect of the errors, in all the constraint equations, on the final motion vector estimate. Of course we do not know, a priori, the standard deviation of the error in the constraint equations. However, this can be estimated once the best fitting motion vector has been estimated. Measuring the error in the motion vector, using this technique, is thus a three stage process. First estimate the best fitting motion vector. Then estimate the standard deviation of the error in the constraint equations. Then use this standard deviation to estimate the error in the best fitting motion vector.

The results of analysing the error in the motion vector using regression analysis are summarised in Equation 20; $\begin{matrix}{{Cov} = \begin{bmatrix}{Cov}_{1,1} & {Cov}_{1,2} \\ {Cov}_{2,1} & {Cov}_{2,2}\end{bmatrix} = \frac{1}{N - 2}\left( v_{n}^{t} \cdot v_{n} - v_{0}^{t} \cdot \vartheta^{t} \cdot v_{n} \right) \cdot \left( \vartheta^{t}\vartheta \right)^{- 1}} & {{Equation}\quad 20}\end{matrix}$ where; $v_{n} = \begin{bmatrix}{vn}_{1} \\ {vn}_{2} \\ \vdots \\ {vn}_{N}\end{bmatrix},\quad v_{0} = \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix},\quad \vartheta = \begin{bmatrix}\cos\theta_{1} & \sin\theta_{1} \\ \cos\theta_{2} & \sin\theta_{2} \\ \vdots & \vdots \\ \cos\theta_{N} & \sin\theta_{N}\end{bmatrix}$

Here Cov is a (statistically unbiased) estimate of the autocovariance matrix for the measured motion vector, the other elements of the equation having been defined previously, vector v₀=(u₀, v₀)^(t) being the best fitting motion vector. Derivation of this equation is described in reference 19 and many other texts. A covariance matrix is a well known multidimensional analogue of the variance of a 1 dimensional random variable. Equation 20 has a scalar and a matrix factor which expand as;

S² = v_(n)^(t)·v_(n) − v₀^(t)·θ^(t)·v_(n) = Σvn² − u₀·Σvn·cos(θ) − v₀·Σvn·sin(θ)

$\begin{matrix}{\left( \vartheta^{t}\vartheta \right)^{- 1} = \begin{bmatrix}{\sum \cos^{2}(\theta)} & {\sum \cos(\theta) \cdot \sin(\theta)} \\ {\sum \cos(\theta) \cdot \sin(\theta)} & {\sum \sin^{2}(\theta)}\end{bmatrix}^{- 1}} & {{Equation}\quad 21}\end{matrix}$

Here S, the scalar error factor, is equivalent to ‘error’, defined in equation 15, and the covariance matrix Cov is equivalent to the error matrix, Q=(E^(t)·E)/N, defined in equation 19.
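The scalar and matrix factors of the covariance estimate can be sketched in software as follows. This is a minimal sketch under stated assumptions: the function name `covariance` is hypothetical, (u₀, v₀) is assumed to be the least squares motion vector already estimated for the region, N is assumed to exceed 2, and the gradient directions are assumed not to be all parallel (otherwise θ^(t)·θ is singular).

```python
import math

def covariance(vn, theta, u0, v0):
    """Return (S2, (Cov11, Cov12, Cov22)) following Equations 20 and 21."""
    n = len(vn)
    cos_t = [math.cos(t) for t in theta]
    sin_t = [math.sin(t) for t in theta]
    # Scalar factor: sum(vn^2) - u0*sum(vn*cos) - v0*sum(vn*sin)
    s2 = (sum(v * v for v in vn)
          - u0 * sum(v * c for v, c in zip(vn, cos_t))
          - v0 * sum(v * s for v, s in zip(vn, sin_t)))
    # Matrix factor: inverse of the 2x2 matrix of summed products
    a = sum(c * c for c in cos_t)
    b = sum(c * s for c, s in zip(cos_t, sin_t))
    d = sum(s * s for s in sin_t)
    det = a * d - b * b
    scale = s2 / (n - 2)
    return s2, (scale * d / det, -scale * b / det, scale * a / det)

# Hypothetical data: two horizontal-gradient and two vertical-gradient
# pixels scattered by +/-0.1 around the least squares vector (1, 1).
theta = [0.0, math.pi / 2, 0.0, math.pi / 2]
vn = [1.1, 0.9, 0.9, 1.1]
s2, cov = covariance(vn, theta, 1.0, 1.0)
```

For this data the scalar factor equals the residual sum of squares, 4 × 0.1² = 0.04, and the covariance matrix is isotropic with a variance of 0.01 on each axis.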

Although equation 21 is seemingly complicated, the covariance matrix Cov is easily derived from intermediate results already calculated to estimate the motion vector. The scalar error factor, S, can be calculated by the apparatus of FIG. 10, whilst θ^(t)·θ (as described in equation 8) has already been calculated to estimate the motion vector. Note the inputs to FIG. 10 have been generated as shown in FIG. 6 or 7; Σvn·cos(θ) and Σvn·sin(θ) being taken after the spatial interpolators if these are included in the system. Once the scalar error factor S has been generated the complete covariance matrix, Cov, may be calculated by the apparatus of FIG. 11. The lookup tables in FIG. 11 each calculate one of the 3 different components of the matrix inverse of θ^(t)·θ. The two inputs to these lookup tables completely specify θ^(t)·θ as described in equations 11 and 12, hence the content of these lookup tables may easily be precalculated.

The error matrix, Q, or the covariance matrix, Cov, are alternative general descriptions of the error distribution in the measurement of the motion vector. The vector error distribution is described by a matrix because the motion vector is, obviously, a vector rather than a scalar quantity. The covariance matrix is the multidimensional analogue of the variance of a scalar quantity. Matrices Q and Cov are simply different descriptions of the error distribution. For a scalar variable there are also alternative measures of the error such as the standard deviation (root mean square error) or the mean absolute error.

Although the error or covariance matrix contains all the information about the error distribution it is sometimes convenient to derive alternative descriptions of the distribution. One convenient representation involves analysing the error or covariance matrix in terms of its eigenvectors and eigenvalues. The error distribution may be thought of as an elliptical region round the motion vector (FIG. 8). The eigenvectors describe the orientation of the principal axes of the ellipse and the eigenvalues their radii. The eigenvalues are the variances in the directions of their corresponding eigenvectors.

Once the error or covariance matrix has been calculated (e.g. as in FIG. 9 or FIGS. 10 and 11) its eigenvalues and eigenvectors may be found using the implementation of FIG. 12 whose inputs are the elements of the error or covariance matrix, i.e. Σ(error²·cos²(θ)), Σ(error²·cos(θ)·sin(θ)) and Σ(error²·sin²(θ)) or Cov_(1,1), Cov_(1,2) and Cov_(2,2), denoted Q₁₁, Q₁₂ and Q₂₂ respectively. Note that, as in FIG. 5, since there are two eigenvalues the implementation of FIG. 12 must be duplicated to generate both eigenvectors. As in FIG. 5, described previously, the implementation of FIG. 12 has been carefully structured so that it uses lookup tables with no more than 2 inputs. In FIG. 12 the output of lookup table 15 is the angular orientation of an eigenvector, that is, the orientation of one of the principal axes of the (2 dimensional) error distribution. The output of lookup table 16, once it has been rescaled by the output of lookup table 17, is proportional to the square root of the corresponding eigenvalue. An alternative function of the eigenvalue may be used depending on the application of the motion vector error information.

The spread vector outputs of FIG. 12 (i.e. (Sx_(i), Sy_(i)), i=1, 2) describe the likely motion vector measurement error for each motion vector in two dimensions. Since a video motion vector is a (2 dimensional) vector quantity, two vectors are required to describe the measurement error. In this implementation the spread vectors point along the principal axes of the distribution of vector measurement errors and their magnitude is the standard deviation of measurement error along these axes. If we assume, for example, that the measurement errors are distributed as a 2 dimensional Gaussian distribution, then the probability distribution of the motion vector, v, is given by equation 22; $\begin{matrix}{P(v) = \frac{1}{2 \cdot \pi \cdot |s_{1}| \cdot |s_{2}|}\exp\left( - \left( \left( (v - v_{m}) \cdot \frac{s_{1}}{2 \cdot |s_{1}|^{2}} \right)^{2} + \left( (v - v_{m}) \cdot \frac{s_{2}}{2 \cdot |s_{2}|^{2}} \right)^{2} \right) \right)} & {{Equation}\quad 22}\end{matrix}$

where v_(m) is the measured motion vector and s₁ and s₂ are the two spread vectors. Of course, the motion vector measurement errors may not have a Gaussian distribution but the spread vectors, defined above, still provide a useful measure of the error distribution. For some applications it may be more convenient to define spread vectors whose magnitude is a different function of the error matrix eigenvalues.
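A software sketch of the eigen analysis that yields the spread vectors is given below. It is not the lookup table arrangement of FIG. 12; it uses the standard closed form for the eigenvalues and principal axis angle of a symmetric 2×2 matrix, and the function name `spread_vectors` is hypothetical. The input matrix is assumed to be positive semi-definite.

```python
import math

def spread_vectors(q11, q12, q22):
    """Return two spread vectors: principal axis directions scaled by
    the standard deviation (square root of the eigenvalue) along them."""
    mean = 0.5 * (q11 + q22)
    half_gap = math.hypot(0.5 * (q11 - q22), q12)
    lam1 = mean + half_gap              # larger eigenvalue (variance)
    lam2 = max(mean - half_gap, 0.0)    # smaller eigenvalue, clamped at 0
    phi = 0.5 * math.atan2(2.0 * q12, q11 - q22)  # angle of major axis
    sd1, sd2 = math.sqrt(lam1), math.sqrt(lam2)
    s1 = (sd1 * math.cos(phi), sd1 * math.sin(phi))
    s2 = (-sd2 * math.sin(phi), sd2 * math.cos(phi))
    return s1, s2

# Hypothetical matrix with eigenvalues 4 and 1 and axes at 45 degrees.
s1, s2 = spread_vectors(2.5, 1.5, 2.5)
```

For this matrix the major spread vector has length 2 along the diagonal direction (1, 1) and the minor spread vector has length 1 along the perpendicular direction.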

An alternative, simplified, output of FIG. 12 is a scalar confidence signal rather than the spread vectors. This may be more convenient for some applications. Such a signal may be derived from r_(error), the product of the outputs of lookup tables 17 and 18 in FIG. 12, which provides a scalar indication of the motion vector measurement error. The scalar error is the geometric mean of the standard deviations along the principal axes of the error distribution. That is, it is the ‘radius’ of a circular, i.e. isotropic, error distribution with the same area as the (anisotropic) elliptical distribution.

The confidence signal may then be used to implement graceful fallback in a motion compensated image interpolator as described in reference 4. For example the motion vector may be scaled by the confidence signal so that it remains unchanged for small motion vector errors but tends to zero for large errors as the confidence decreases to zero. The r_(error) signal is a scalar, average, measure of motion vector error. It assumes that the error distribution is isotropic and, whilst this may not be justified in some situations, it allows a simple confidence measure to be generated. Note that the scalar vector error, r_(error), is an objective function of the video signal, whilst the derived confidence signal is an interpretation of it.

A confidence signal may be generated by assuming that there is a small range of vectors which shall be treated as correct. This predefined range of correct vectors will depend on the application. We may, for example, define motion vectors to be correct if they are within, say, 10% of the true motion vector. Outside the range of correct vectors we shall have decreasing confidence in the motion vector. The range of correct motion vectors is the confidence region specified by r_(confident) which might, typically, be defined according to equation 23;

r_(confident) = √(k²·|v|² + r₀²)  Equation 23

where k is a small fraction (typically 10%), r₀ is a small constant (typically 1 pixel/field) and |v| is the measured motion speed. The parameters k and r₀ can be adjusted during testing to achieve best results. Hence the region of confidence is proportional to the measured motion speed except at low speeds when it is a small constant. The confidence value is then calculated, for each output motion vector, as the probability that the actual velocity is within the confidence radius, r_(confident), of the measured velocity. This may be determined by assuming a Gaussian probability distribution: ${confidence} = \frac{\int_{0}^{r_{confident}} 2\pi r \cdot \exp\left( - \frac{1}{2}\frac{r^{2}}{r_{error}^{2}} \right) dr}{\int_{0}^{\infty} 2\pi r \cdot \exp\left( - \frac{1}{2}\frac{r^{2}}{r_{error}^{2}} \right) dr}$

giving the following expression for vector confidence (equation 24):$\begin{matrix}{{confidence} = {1 - {\exp \quad \left( {{- \frac{1}{2}}\frac{r_{confident}^{2}}{r_{error}^{2}}} \right)}}} & {{Equation}\quad 24}\end{matrix}$
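The confidence calculation of equations 23 and 24 can be sketched in a few lines. The function name `confidence` is hypothetical; the default values k = 0.1 and r₀ = 1 pixel/field are the typical values quoted above, and treating a zero r_(error) as full confidence is an assumption made here to avoid a division by zero.

```python
import math

def confidence(u, v, r_error, k=0.1, r0=1.0):
    """Probability that the true vector lies within r_confident of (u, v)."""
    r_conf_sq = k * k * (u * u + v * v) + r0 * r0   # Equation 23, squared
    if r_error == 0.0:
        return 1.0  # assumption: no measured error means full confidence
    return 1.0 - math.exp(-0.5 * r_conf_sq / (r_error * r_error))  # Equation 24

# Hypothetical cases: a stationary vector and a fast vector, each with
# an error radius of 1 pixel/field.
c_slow = confidence(0.0, 0.0, 1.0)
c_fast = confidence(30.0, 40.0, 1.0)
```

With the same error radius the fast vector receives the higher confidence because the region of confidence grows with the measured speed, while for a fixed speed the confidence falls as r_(error) increases.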

An embodiment of apparatus for estimating vector error is shown in FIGS. 6, 7, 9 and 12, or in FIGS. 6, 7, 10, 11 and 12. The apparatus of FIG. 9 calculates the error matrix using the outputs from the apparatus of FIG. 6, which were generated previously to estimate the motion vector. Alternatively the apparatus of FIGS. 10 and 11 calculates the covariance matrix using outputs from the apparatus of FIGS. 6 and 7, which were generated previously to estimate the motion vector. The error matrix, (E^(t)·E)/N, or covariance matrix, Cov, input in FIG. 12 is denoted Q to simplify the labelling. The contents of the lookup tables in FIG. 12 are defined by:

${{Look}\quad {Up}\quad {Table}\quad {\# 15}} = \arctan\left( \frac{{\pm \sqrt{x^{2} + 4y^{2}}} - x}{2y} \right)$

${{Look}\quad {Up}\quad {Table}\quad {\# 16}} = \sqrt{\frac{z}{2} \cdot \left( 1 \mp \sqrt{x^{2} + y^{2}} \right)}$

${{Look}\quad {Up}\quad {Table}\quad {\# 17}} = \sqrt{z}$

${{Look}\quad {Up}\quad {Table}\quad {\# 18}} = \sqrt[4]{\frac{1}{4}\left( 1 - x^{2} \right) - y^{2}}$

where; $x = \frac{Q_{1,1} - Q_{2,2}}{Q_{1,1} + Q_{2,2}};\quad y = \frac{Q_{1,2}}{Q_{1,1} + Q_{2,2}}\quad\&\quad z = Q_{1,1} + Q_{2,2}$

${{Look}\quad {Up}\quad {Table}\quad {\# 19}} = 1 - \exp\left( - \frac{1}{2} \cdot \frac{r_{confident}^{2}}{r_{error}^{2}} \right)$

where; r_(confident) = √(k²·(u²+v²) + r₀²)

and where, in the definitions of lookup tables 15 and 16, the positive sign is taken for one of the eigen analysis units and the negative sign is taken for the other unit.

The input of lookup table 17 in FIG. 12 (Q₁₁+Q₂₂) is a dimensioned parameter (z) which describes the scale of the distribution of motion vector errors. The content of lookup table 17 is defined by √z. The output of lookup table 17 is a scaling factor which can be used to scale the output of lookup table 16 defined above. The input to the polar to rectangular co-ordinate converter is, therefore, related to the length of each principal axis of the error distribution. Using a different lookup table it would be possible to calculate the spread vectors directly in Cartesian co-ordinates.

The apparatus described in relation to FIG. 12 is capable of producing both the spread vectors and the scalar confidence signal. The present invention also encompasses methods and apparatus which generate only one such parameter; either the confidence signal or the spread vectors. The eigen analysis performed by the apparatus of FIG. 12 must be performed twice to give the spread vectors for both principal axes of the error distribution; only one implementation of FIG. 12 is required to generate r_(error) and the derived confidence signal. The inputs to lookup table 18 are the same as for lookup table 15 (x and y). The content of lookup table 18 is defined by ⁴√(¼(1−x²)−y²). The output of lookup table 18 scaled by the output of lookup table 17 gives r_(error), a scalar (isotropic) vector error from which a confidence signal is generated in lookup table 19, the contents of which are defined by equation 24, for example. r_(error) is the geometric mean of the lengths of the major and minor axes of the error distribution, that is, r_(error)=√(σ₁·σ₂).
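The factorisation of r_(error) into a dimensioned scale (lookup table 17) and a function of the normalised inputs x and y (lookup table 18) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, Q₁₁+Q₂₂ is assumed to be positive, and the direct form uses the fact that the product of the two eigenvalues of a 2×2 matrix equals its determinant.

```python
import math

def r_error_from_tables(q11, q12, q22):
    """r_error as the product of the lookup table 17 and 18 outputs."""
    z = q11 + q22
    x = (q11 - q22) / z
    y = q12 / z
    table17 = math.sqrt(z)
    table18 = max(0.25 * (1.0 - x * x) - y * y, 0.0) ** 0.25
    return table17 * table18

def r_error_direct(q11, q12, q22):
    """Fourth root of the determinant, i.e. sqrt(sigma1 * sigma2)."""
    return max(q11 * q22 - q12 * q12, 0.0) ** 0.25

# Hypothetical matrix with variances 4 and 1 along its principal axes.
r_tables = r_error_from_tables(2.5, 1.5, 2.5)
r_direct = r_error_direct(2.5, 1.5, 2.5)
```

Both routes give √2 for this matrix, the geometric mean of the standard deviations 2 and 1.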

An alternative embodiment of apparatus for estimating motion vector error is shown in FIGS. 6, 7, 10 and 13. This embodiment may be used if the error is estimated using the covariance matrix but not using the error matrix. A key feature of this embodiment is that many functions of the covariance matrix may be generated using only a 2 input lookup table and multiplier as shown in FIG. 13. The apparatus of FIG. 13 calculates the spread vectors and r_(error) using intermediate signals from FIG. 7, Σcos(2θ) and Σsin(2θ) (taken after the spatial interpolators if these are included in the system), which were generated previously to estimate the motion vector, and the scalar error factor, S, which is the output of FIG. 10.

The top 4 lookup tables of FIG. 13 each generate a component of one of the 2 vectors defined in equation 25.

vector_(i) = λ_(i)·e_(i)  Equation 25

where; i=1, 2 and (θ^(t)·θ)⁻¹·e_(i) = λ_(i)·e_(i)

Since the inputs to the lookup tables completely define θ^(t)·θ (as noted above) it is straightforward to precalculate the content of these lookup tables. Multiplied by the scalar error factor, S, the vector components defined in equation 25 give the components of the two spread vectors defined above (identical to the spread vector outputs of FIG. 12). Hence the outputs of the top 4 multipliers each produce one component (horizontal or vertical) of one of the two spread vectors (defined above).

Lookup table 24 and the bottom multiplier of FIG. 13 generate r_(error) (identical to r_(error) of FIG. 12) which is then combined with the motion speed in lookup table 25 to give the confidence signal (identical to that in FIG. 12). Lookup table 24 generates the square root of the determinant of (θ^(t)·θ)⁻¹ which when multiplied by S gives r_(error). Mathematically, using the same notation as equation 25, the output of lookup table 24 is given by equation 26.

Look up table 24 = √|(θ^(t)·θ)⁻¹| = √(λ₁·λ₂)  Equation 26

Since the inputs to lookup table 24 completely define θ^(t)·θ it is straightforward to precalculate the content of this lookup table. Lookup table 25 in FIG. 13 has exactly the same function and content as lookup table 19 in FIG. 12.

In FIGS. 7, 9 and 10, picture resizing is allowed for using (intrafield) spatial interpolators (44) following the region averaging filters (38, 39, 42). Picture resizing is optional and is required, for example, for overscan and aspect ratio conversion. The apparatus of FIG. 6 generates its outputs on the nominal output standard, that is, assuming no picture resizing. The conversion from input to (nominal) output standard is achieved using (bilinear) vertical/temporal interpolators (20). Superficially it might appear that these interpolators (20) could also perform the picture stretching or shrinking required for resizing. However, if this were done the region averaging filters (38, 42) in FIGS. 7, 9 and 10 would have to vary in size with the resizing factor. This would be very awkward for large picture expansions as very large region averaging filters (38, 42) would be required. Picture resizing is therefore achieved after the region averaging filters using purely spatial (intrafield) interpolators (44), for example bilinear interpolators. In fact the function of the vertical/temporal filters (20) in FIG. 6 is, primarily, to interpolate to the output field rate. The only reason they also change the line rate is to maintain a constant data rate.

EXPERIMENTAL RESULTS

Experiments were performed to simulate the basic motion estimation algorithm (FIGS. 2 & 3), use of the normalised constraint equation (FIGS. 6 & 7), the Martinez technique with the normalised constraint equation and estimation of vector measurement error (FIGS. 9 & 5). In general these experiments confirmed the theory and techniques described above.

Simulations were performed using a synthetic panning sequence. This was done both for convenience and because it allowed a precisely known motion to be generated. Sixteen field long interlaced sequences were generated from an image for different motion speeds. The simulation suggests that the basic gradient motion estimation algorithm gives the correct motion vector with a (standard deviation) measurement error of about ±¼ pixel/field. The measured velocity at the edge of the picture generally tends towards zero because the filters used are not wholly contained within the image. Occasionally unrealistically high velocities are generated at the edge of the image. The use of the normalised constraint equation gave similar results to the unnormalised equation. Use of the Martinez technique gave varying results depending on the level of noise assumed. This technique never made things worse and could significantly reduce worst case (and average) errors at the expense of biasing the measured velocity towards zero. The estimates of the motion vector error were consistent with the true (measured) error.

EXAMPLE

This example provides a brief specification for a gradient motion estimator for use in a motion compensated standards converter. The input for this gradient motion estimator is interlaced video in either 625/50/2:1 or 525/60/2:1 format. The motion estimator produces motion vectors on one of the two possible input standards and also an indication of the vector's accuracy on the same standard as the output motion vectors. The motion vector range is at least ±32 pixels/field. The vector accuracy is output as both a ‘spread vector’ and a ‘confidence signal’.

A gradient motion estimator is shown in block diagram form in FIGS. 6 & 7 above. Determination of the measurement error, indicated by ‘spread vectors’ and ‘confidence’, is shown in FIGS. 9 & 12. The characteristics of the functional blocks of these block diagrams are as follows:

Input Video:

4:2:2 raster scanned interlaced video.

luminance component only

Active field 720 pixels × 288 or 244 field lines depending on input standard.

Luminance coding 10 bit, unsigned binary representing the range 0 to (2¹⁰−1)

Temporal Halfband Lowpass Filter (14):

Function: Temporal filter operating on luminance. Implemented as a vertical/temporal filter because the input is interlaced. The coefficients are defined by the following matrix in which columns represent fields and rows represent picture (not field) lines. ${{Temporal}\quad {Halfband}\quad {filter}\quad {coefficients}} = {\frac{1}{8}\begin{bmatrix}1 & 0 & 1 \\0 & 4 & 0 \\1 & 0 & 1\end{bmatrix}}$

Input: 10 bit unsigned binary representing the range 0 to 1023 (decimal).

Output: 12 bit unsigned binary representing the range 0 to 1023.75 (decimal) with 2 fractional bits.

Vertical Lowpass Filter (12):

Function: Vertical intra field, 1/16^(th) band, lowpass, prefilter and anti-alias filter. Cascade of 3 vertical running sum filters with lengths 16, 12 and 5 field lines. The output of this cascade of running sums is divided by 1024 to give an overall DC gain of 15/16. The overall length of the filter is 31 field lines.

Input: As Temporal Halfband Lowpass Filter output.

Output: As Temporal Halfband Lowpass Filter output.

Horizontal Lowpass Filter (10):

Function: Horizontal, 1/32^(nd) band, lowpass, prefilter. Cascade of 3 horizontal running sum filters with lengths 32, 21 and 12 pixels. The output of this cascade is divided by 8192 to give an overall DC gain of 63/64. The overall length of the filter is 63 pixels.

Input: As Vertical Lowpass Filter output.

Output: As Vertical Lowpass Filter output.
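The quoted lengths and DC gains of the two prefilters follow from the properties of cascaded running sums, and can be checked with a short sketch. The function names below are hypothetical; the impulse response is obtained by convolving the box filters, its length is the number of taps and its sum divided by the stated divisor is the DC gain.

```python
from fractions import Fraction

def cascade_impulse_response(lengths):
    """Impulse response of a cascade of running sum (box) filters."""
    h = [1]
    for n in lengths:
        out = [0] * (len(h) + n - 1)
        for i, a in enumerate(h):
            for j in range(n):
                out[i + j] += a     # convolve with a box of n ones
        h = out
    return h

def cascade_summary(lengths, divisor):
    """Return (overall length in samples, DC gain as an exact fraction)."""
    h = cascade_impulse_response(lengths)
    return len(h), Fraction(sum(h), divisor)

vertical = cascade_summary([16, 12, 5], 1024)
horizontal = cascade_summary([32, 21, 12], 8192)
```

The vertical cascade gives 31 samples with a DC gain of 15/16 and the horizontal cascade gives 63 samples with a DC gain of 63/64, in agreement with the figures stated for the two filters.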

Temporal Differentiator (16):

Function: Temporal differentiation of prefiltered luminance signal. Implemented as a vertical/temporal filter for interlaced inputs. ${{Temporal}\quad {Differentiator}\quad {coefficients}} = {\frac{1}{4}\begin{bmatrix}1 & 0 & {- 1} \\0 & 0 & 0 \\1 & 0 & {- 1}\end{bmatrix}}$

Input: As Horizontal Lowpass Filter output.

Output: 12 bit 2's complement binary representing the range −2⁹ to (+2⁹−2⁻²).

Horizontal Differentiator (17):

Function: Horizontal differentiation of prefiltered luminance signal. 3 tap horizontal filter with coefficients ½(1, 0, −1) on consecutive pixels.

Input: As Horizontal Lowpass Filter output.

Output: 8 bit 2's complement binary representing the range −2⁴ to (+2⁴−2⁻³).

Vertical Differentiator (18):

Function: Vertical differentiation of prefiltered luminance signal. 3 tap, intra-field, vertical filter with coefficients ½(1, 0, −1) on consecutive field lines.

Input: As Horizontal Lowpass Filter output.

Output: 8 bit 2's complement binary representing the range −2⁴ to (+2⁴−2⁻³).
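The signal ranges quoted for the differentiator outputs correspond to 2's complement words with a fixed number of fractional bits. The sketch below reproduces them; the function name is hypothetical, and the split into integer and fractional bits (2 fractional bits for the 12 bit word, 3 for the 8 bit words) is inferred here from the stated ranges rather than given explicitly in the specification.

```python
from fractions import Fraction

def twos_complement_range(total_bits, fractional_bits):
    """Return (minimum, maximum, step) of a fixed point 2's complement word."""
    step = Fraction(1, 2 ** fractional_bits)
    lowest = -(2 ** (total_bits - 1)) * step
    highest = (2 ** (total_bits - 1) - 1) * step
    return lowest, highest, step

temporal_out = twos_complement_range(12, 2)   # expected -2^9 to 2^9 - 2^-2
spatial_out = twos_complement_range(8, 3)     # expected -2^4 to 2^4 - 2^-3
```

The 12 bit word spans −512 to 511.75 in steps of 0.25 and the 8 bit words span −16 to 15.875 in steps of 0.125.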

Compensating Delay (19):

Function: Delay of 1 input field.

Input & Output: As Horizontal Lowpass Filter output.

Vertical/Temporal Interpolators (20):

Function: Conversion between input and output scanning standards. Cascade of intra field, 2 field line linear interpolator and 2 field linear interpolator, i.e. a vertical/temporal bi-linear interpolator. Interpolation accuracy to nearest 1/32^(nd) field line and nearest 1/16^(th) field period.

Inputs: as indicated in FIG. 6 and specified above.

Outputs: same precision as inputs.

θ: Orientation of spatial gradient vector of image brightness. 12 bit unipolar binary spanning the range 0 to 2π, i.e. quantisation step is 2π/2¹². This is the same as 2's complement binary spanning the range −π to +π.

|∇I|: Magnitude of spatial gradient vector of image brightness. 12 bit unipolar binary spanning the range of 0 to 16 (input grey levels/pixel) with 8 fractional bits.

n: Noise level of |∇I| adjustable from 1 to 16 input grey levels/pixel.

vn: Motion vector of current pixel in direction of brightness gradient. 12 bit, 2's complement binary clipped to the range −2⁶ to (+2⁶−2⁻⁵) pixels/field.

Polar to Rectangular Co-ordinate Converter (40):

Inputs: as vn & θ above

Outputs: 12 bit, 2's complement binary representing the range −2⁶ to (+2⁶−2⁻⁵).

Lookup Tables No. 5 & No. 6 (FIGS. 7 and 9)

Function: Cosine and Sine lookup tables respectively.

Inputs: as θ above.

Outputs: 12 bit, 2's complement binary representing the range −1 to (+1−2⁻¹¹).

Region Averaging Filters (38,39,42):

Function: Averaging signals over a region of the image. 95 pixels by 47 field lines, intrafield, running average filter.

Inputs & Outputs: 12 bit 2's complement binary.
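
One way to model the region averaging filter is as a separable box filter: a horizontal running average followed by a vertical one. The sketch below makes several assumptions that are not taken from this specification: the function names, the use of prefix sums in place of a hardware running accumulator, float outputs, and a window that shrinks at the picture edges.

```python
def box_mean_1d(samples, width):
    """Running average of `samples` over a centred window of `width` taps.

    Assumption: near the ends the window shrinks to the samples available.
    """
    half = width // 2
    n = len(samples)
    prefix = [0]
    for s in samples:                 # prefix[i] = sum of samples[:i]
        prefix.append(prefix[-1] + s)
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def region_average(field, width=95, height=47):
    """Average over `width` pixels by `height` field lines within one field."""
    rows = [box_mean_1d(row, width) for row in field]               # horizontal
    cols = [box_mean_1d(list(col), height) for col in zip(*rows)]   # vertical
    return [list(row) for row in zip(*cols)]                        # transpose back
```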

Spatial Interpolators (44):

Function: Converting spatial scanning to allow for picture resizing. Spatial, intrafield bilinear interpolator. Interpolation accuracy to nearest 1/32nd field line and nearest 1/16th pixel.

Inputs: 12 bit 2's complement binary.

Outputs: 12 or 8/9 bit 2's complement binary.

Upper Interpolators feeding multipliers 12 bit.

Lower Interpolators feeding Lookup tables 8/9 bit (to ensure a practical size table).

Look Up Tables 7 to 9 (FIG. 7):

Function: Calculating matrix ‘Z’ defined in equation 14 above.

Parameters n₁ & n₂ adjust on test (approx. 0.125).

Inputs: 8/9 bit 2's complement binary representing −1 to (approx.) +1.

Outputs: 12 bit 2's complement binary representing the range −16 to (+16−2⁻⁵).

Multipliers & Accumulators.

Inputs & Outputs: 12 bit 2's complement binary.

Motion Vector Output:

Output of FIG. 7.

Motion vectors are measured in input picture lines (not field lines) or horizontal pixels per input field period.

Motion speeds are unlikely to exceed ±48 pixels/field but an extra bit is provided for headroom.

Raster scanned interlaced fields.

Active field depends on output standard: 720 pixels × 288 or 244 field lines.

12 bit signal, 2's complement coding, 8 integer and 4 fractional bits representing the range −128 to (+128−2⁻⁴).
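
The 12 bit coding of the motion vector output can be illustrated with a short encode/decode sketch. The helper names `encode_fixed` and `decode_fixed` are hypothetical, and the saturation of out-of-range values is an assumption made here. Python's built-in `round` rounds ties to the nearest even code, which may differ from the rounding used in the hardware.

```python
def encode_fixed(value, total_bits=12, frac_bits=4):
    """Encode a real value as a 2's complement word, saturating at the limits."""
    lo = -(1 << (total_bits - 1))          # most negative code, e.g. -2048
    hi = (1 << (total_bits - 1)) - 1       # most positive code, e.g. +2047
    code = round(value * (1 << frac_bits))
    code = max(lo, min(hi, code))          # saturate (assumed behaviour)
    return code & ((1 << total_bits) - 1)  # 2's complement bit pattern


def decode_fixed(word, total_bits=12, frac_bits=4):
    """Decode a 2's complement word back to a real value."""
    if word & (1 << (total_bits - 1)):     # sign bit set: negative value
        word -= 1 << total_bits
    return word / (1 << frac_bits)
```

With the default parameters the representable range is −128 to +128−2⁻⁴ in steps of 2⁻⁴, so a speed of 1.5 pixels/field is stored as the code 24 and an out-of-range value of 200 saturates to 127.9375.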

Spread Vectors S₁ and S₂ (Output of FIG. 12):

Spread vectors represent the measurement spread of the output motion vectors parallel and perpendicular to edges in the input image sequence.

The spread vectors are of magnitude σ (where σ represents standard deviation) and point in the direction of the principal axes of the expected distribution of measurement error.

Each spread vector has two components, each coded using 2's complement fractional binary representing the range −4 to (+4−2⁻⁷).
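
To make the geometry of the spread vectors concrete, the sketch below derives two vectors of magnitude σ along the principal axes of a 2×2 symmetric covariance matrix using the closed-form eigen-decomposition. It is a generic illustration rather than the circuit of FIG. 12: the function name, the argument names `cxx`, `cxy`, `cyy`, and the ordering of the results (larger spread first) are assumptions.

```python
import math


def spread_vectors(cxx, cxy, cyy):
    """Return (s1, s2): vectors of length sigma along the principal axes.

    cxx, cxy, cyy are the elements of a symmetric 2x2 covariance matrix.
    Assumption: s1 lies along the axis with the larger variance.
    """
    mean = 0.5 * (cxx + cyy)
    radius = math.hypot(0.5 * (cxx - cyy), cxy)
    var_major = mean + radius                  # larger eigenvalue
    var_minor = max(mean - radius, 0.0)        # guard against rounding below 0

    # Orientation of the major principal axis.
    angle = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    ux, uy = math.cos(angle), math.sin(angle)

    sigma1 = math.sqrt(var_major)
    sigma2 = math.sqrt(var_minor)
    s1 = (sigma1 * ux, sigma1 * uy)
    s2 = (-sigma2 * uy, sigma2 * ux)           # perpendicular to s1
    return s1, s2
```

For a covariance with variances 4 and 1 and no cross term, the sketch returns vectors close to (2, 0) and (0, 1), i.e. one standard deviation along each axis.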

Confidence Output:

Output of FIG. 12; the derivation of the confidence signal is described above.

The confidence signal is an indication of the reliability of the ‘Output Motion Vector’. Confidence of 1 represents high confidence, 0 represents no confidence.

The confidence signal uses 8 bit linear coding with 8 fractional bits representing the range 0 to (1−2⁻⁸).

REFERENCES

1. Aggarwal, J. K. & Nandhakumar, N. 1988. On the computation of motion from sequences of images—a review. Proc. IEEE, vol. 76, pp. 917-935, August 1988.

2. Bierling, M., Thoma, R. 1986. Motion compensating field interpolation using a hierarchically structured displacement estimator. Signal Processing, Volume 11, No. 4, December 1986, pp. 387-404. Elsevier Science publishers.

3. Borer, T. J., 1992. Television Standards Conversion, Ph.D. Thesis. Dept. Electronic & Electrical Engineering, University of Surrey, Guildford, Surrey, GU2 5XH, UK. October 1992.

5. Cafforio, C., Rocca, F. 1983. The differential method for image motion estimation. Image sequence processing and dynamic scene analysis (ed. T. S. Huang). Springer-Verlag, pp. 104-124, 1983.

6. Cafforio, C., Rocca, F., Tubaro, S., 1990. Motion Compensated Image Interpolation. IEEE Trans. on Comm. Vol. 38, No. 2, February 1990, pp. 215-222.

7. Dubois, E., Konrad, J., 1990. Review of techniques for motion estimation and motion compensation. Fourth International Colloquium on Advanced Television Systems, Ottawa, Canada, June 1990. Organised by CBC Engineering, Montreal, Quebec, Canada.

8. Fennema, C. L., Thompson, W. B., 1979. Velocity determination in scenes containing several moving objects. Computer Vision, Graphics and Image Processing, Vol. 9, pp. 301-315, 1979.

9. Huang, T. S., Tsai, R. Y., 1981. Image sequence analysis: Motion estimation. Image sequence analysis, T. S. Huang (editor), Springer-Verlag, Berlin, Germany, 1981, pp. 1-18.

10. Konrad, J., 1990. Issues of accuracy and complexity in motion compensation for ATV systems. Contribution to ‘Les Assises Des Jeunes Chercheurs’, CBC, Montreal, June 1990.

11. Lim, J. S., 1990. Two-dimensional signal and image processing. Prentice Hall 1990, ISBN 0-13-934563-9, pp. 497-511.

12. Martinez, D. M. 1987. Model-based motion estimation and its application to restoration and interpolation of motion pictures. RLE Technical Report No. 530. June 1987. Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Mass. 02139 USA.

13. Netravali, A. N., Robbins, J. D. 1979. Motion compensated television coding, Part 1. Bell Syst. Tech. J., vol. 58, pp. 631-670, March 1979.

14. Paquin, R., Dubois, E., 1983. A spatio-temporal gradient method for estimating the displacement vector field in time-varying imagery. Computer Vision, Graphics and Image Processing, Vol. 21, 1983, pp. 205-221.

15. Robert, P., Cafforio, C., Rocca, F., 1985. Time/Space recursion for differential motion estimation. SPIE Symp., Cannes, France, November 1985.

16. Thomson, R. 1995. Problems of Estimation and Measurement of Motion in Television. I.E.E. Colloquium on motion reproduction in television. I.E.E. Digest No: 1995/093, May 3, 1995.

17. Vega-Riveros, J. F., Jabbour, K. 1989. Review of motion analysis techniques. IEE Proceedings, Vol. 136, Pt I., No. 6, December 1989.

18. Wu, S. F., Kittler, J., 1990. A differential method for the simultaneous estimation of rotation, change of scale and translation. Image Communication, Vol. 2, No. 1, May 1990, pp. 69-80.

19. Montgomery, D. C., Peck, E. A., 1992. Introduction to linear regression analysis. Second Edition. John Wiley & Sons, Inc. ISBN 0-471-53387-4.

What is claimed is:
 1. Video or film signal processing apparatus comprising: a motion estimation apparatus for generating best fit motion vectors, each best fit motion vector corresponding to a region of an input signal, a means for calculating, for each of said regions of the input signal, a plurality of spatial and temporal image gradients, a means for calculating, for each said best fit motion vector, a plurality of error values corresponding to said plurality of image gradients, a means for calculating a plurality of error vectors from said plurality of error values, a logic means adapted to calculate for each motion vector an estimate of the distribution of vector measurement errors in calculating said best fit motion vector, and a means adapted to generate, for each said motion vector, an indication of the motion vector measurement error derived from said estimate.
 2. Video or film signal processing apparatus as claimed in claim 1, wherein said logic means provides, for each motion vector, a statistical analysis of the error in the constraint equation for each of a plurality of pixels in a region of the input signal.
 3. Video or film signal processing apparatus as claimed in claim 1, wherein said logic means provides, for each motion vector, a matrix representing the dimensions and orientation of the error distribution.
 4. Video or film signal processing apparatus as claimed in claim 1, wherein the apparatus includes means for calculating the elements of an error matrix from said error vectors, said matrix representing the distribution of motion vector measurement errors.
 5. Video or film signal processing apparatus as claimed in claim 4, wherein the apparatus further includes means for performing an eigenvector analysis on said error matrix or on a covariance matrix for the measured motion vector.
 6. Video or film signal processing apparatus comprising: a motion estimation apparatus for generating best fit motion vectors, each best fit motion vector corresponding to a region of an input signal, a means for calculating, for each of said regions of the input signal, a plurality of spatial and temporal image gradients, a means for calculating, for each said best fit motion vector, a plurality of error values corresponding to said plurality of image gradients, a means for calculating a plurality of error vectors from said plurality of error values, a logic means adapted to calculate for each motion vector, an estimate of the distribution of vector measurement errors in calculating said best fit motion vector, and a means adapted to generate, for each said vector, an indication of the motion vector measurement error derived from said estimate, wherein the apparatus includes means for calculating the standard deviation of the error in the constraint equations for each of a plurality of pixels in a region of said input signal, and means for estimating the error in measuring the motion vector using the resultant standard deviation, and wherein an estimate of the covariance matrix for the measured motion vector is generated, the covariance matrix having vector components.
 7. Video or film signal processing apparatus as claimed in claim 6, wherein the apparatus further includes means for performing an eigenvector analysis on said error matrix or on said covariance matrix.
 8. A method of video or film signal processing comprising the steps of: generating best fit motion vectors corresponding to plural regions of an input signal, calculating, for each of said plural regions of the input signal, a plurality of spatial and temporal image gradients, calculating, for each said best fit motion vector, a plurality of error values corresponding to said plurality of image gradients, calculating a plurality of error vectors from said plurality of error values, calculating, for each motion vector, an estimate of the distribution of vector measurement errors in calculating said best fit motion vector, and generating, for each said motion vector, an indication of the motion vector measurement error derived from said estimate.
 9. A method of video or film signal processing apparatus as claimed in claim 8, wherein said step of calculating provides, for each vector, a statistical analysis of the error in the constraint equation for each of said pixels.
 10. A method of video or film signal processing as claimed in claim 8, wherein said step of calculating provides, for each motion vector, a matrix representing the dimensions and orientation of the error distribution.
 11. A method of video or film signal processing as claimed in claim 8, wherein the method includes the step of calculating the elements of an error matrix from said error vectors, said matrix representing the distribution of motion vector measurement errors.
 12. A method of video or film signal processing as claimed in claim 11, wherein the method further includes performing an eigenvector analysis on said error matrix or on a covariance matrix for the measured motion vector.
 13. A method of video or film signal processing as claimed in claim 8, wherein the method includes calculating the standard deviation of the error in the constraint equations for each of a plurality of pixels in a region of said input signal, and estimating the error in measuring the motion vector using the resultant standard deviation, whereby an estimate of the covariance matrix for the measured motion vector is generated.
 14. A method of video or film signal processing as claimed in claim 13, wherein the method further includes performing an eigenvector analysis on said error matrix or on said covariance matrix.